    Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi

    Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect, but playing in an ad-hoc setting requires agents to adapt to their partners' strategies with no previous coordination. Evaluating an agent in this setting requires a diverse population of potential partners, but so far, the behavioral diversity of agents has not been considered in a systematic way. This paper proposes Quality Diversity algorithms as a promising class of algorithms to generate diverse populations for this purpose, and generates a population of diverse Hanabi agents using MAP-Elites. We also postulate that agents can benefit from a diverse population during training and implement a simple "meta-strategy" for adapting to an agent's perceived behavioral niche. We show this meta-strategy can work better than generalist strategies even outside the population it was trained with if its partner's behavioral niche can be correctly inferred, but in practice a partner's behavior depends on and interferes with the meta-agent's own behavior, suggesting an avenue for future research in characterizing another agent's behavior during gameplay. Comment: arXiv admin note: text overlap with arXiv:1907.0384
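
    The MAP-Elites idea mentioned in this abstract can be sketched on a toy problem. This is a minimal illustration, not the paper's Hanabi setup: the genome encoding, the `evaluate` function, and the two behavior dimensions are all hypothetical placeholders chosen so the example is self-contained.

```python
import random

def map_elites(evaluate, random_genome, mutate, bins=5, iterations=2000,
               warmup=100, seed=0):
    """Minimal MAP-Elites: keep the fittest genome found in each behavior cell."""
    rng = random.Random(seed)
    archive = {}  # cell (tuple of bin indices) -> (fitness, genome)
    for i in range(iterations):
        if archive and i >= warmup:
            # Pick a random elite and perturb it.
            _, parent = rng.choice(list(archive.values()))
            child = mutate(parent, rng)
        else:
            # Warm-up phase: sample genomes at random.
            child = random_genome(rng)
        fitness, behavior = evaluate(child)
        # Discretize each behavior value in [0, 1] into one of `bins` cells.
        cell = tuple(min(int(b * bins), bins - 1) for b in behavior)
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive

# Hypothetical toy domain: 4 parameters in [0, 1]; the first two double as
# the behavior descriptor, and fitness rewards staying close to 0.5.
def random_genome(rng):
    return [rng.random() for _ in range(4)]

def mutate(genome, rng):
    return [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in genome]

def evaluate(genome):
    fitness = -sum((g - 0.5) ** 2 for g in genome)
    behavior = (genome[0], genome[1])
    return fitness, behavior

archive = map_elites(evaluate, random_genome, mutate)
print(len(archive), "of", 5 * 5, "behavior cells filled")
```

    The archive is the "diverse population": one elite per behavioral niche rather than a single best individual.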

    Neural Task Programming: Learning to Generalize Across Hierarchical Tasks

    In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the ideas of few-shot learning from demonstration and neural program induction. NTP takes as input a task specification (e.g., video demonstration of a task) and recursively decomposes it into finer sub-task specifications. These specifications are fed to a hierarchical neural program, where bottom-level programs are callable subroutines that interact with the environment. We validate our method in three robot manipulation tasks. NTP achieves strong generalization across sequential tasks that exhibit hierarchical and compositional structures. The experimental results show that NTP learns to generalize well towards unseen tasks with increasing lengths, variable topologies, and changing objectives. Comment: ICRA 201
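
    The recursive decomposition described here can be pictured with a small sketch. It only mirrors the control flow: in NTP the decomposition is learned by a neural network, whereas the `rules` table, the task names, and the primitives below are hand-written, hypothetical stand-ins.

```python
def run_program(spec, primitives, decompose, trace):
    """Recursively expand a task specification until it names a primitive."""
    if spec in primitives:
        # Bottom-level program: a callable subroutine acting on the environment.
        trace.append(primitives[spec]())
        return
    for sub_spec in decompose(spec):
        run_program(sub_spec, primitives, decompose, trace)

# Hypothetical primitives; each just reports the action it would execute.
primitives = {
    "reach": lambda: "reach",
    "grasp": lambda: "grasp",
    "move": lambda: "move",
    "release": lambda: "release",
}

# Hypothetical hand-coded decomposition standing in for the learned model.
rules = {
    "stack": ["pick", "place"],
    "pick": ["reach", "grasp"],
    "place": ["move", "release"],
}

trace = []
run_program("stack", primitives, rules.__getitem__, trace)
print(trace)  # ['reach', 'grasp', 'move', 'release']
```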

    A functorial approach to monomorphism categories II: Indecomposables

    We investigate the (separated) monomorphism category $\operatorname{mono}(Q,\Lambda)$ of a quiver over an Artin algebra $\Lambda$. We show that there exists a representation equivalence in the sense of Auslander from $\overline{\operatorname{mono}}(Q,\Lambda)$ to $\operatorname{rep}(Q,\overline{\operatorname{mod}}\,\Lambda)$, where $\operatorname{mod}\Lambda$ is the category of finitely generated modules and $\overline{\operatorname{mod}}\,\Lambda$ and $\overline{\operatorname{mono}}(Q,\Lambda)$ denote the respective injectively stable categories. Furthermore, if $Q$ has at least one arrow, then we show that this is an equivalence if and only if $\Lambda$ is hereditary. In general, the representation equivalence induces a bijection between indecomposable objects in $\operatorname{rep}(Q,\overline{\operatorname{mod}}\,\Lambda)$ and non-injective indecomposable objects in $\operatorname{mono}(Q,\Lambda)$, and we show that the generalized Mimo-construction, an explicit minimal right approximation into $\operatorname{mono}(Q,\Lambda)$, gives an inverse to this bijection. We apply these results to describe the indecomposables in the monomorphism category of a radical-square-zero Nakayama algebra, and to give a bijection between the indecomposables in the monomorphism category of two artinian uniserial rings of Loewy length $3$ with the same residue field. The main tool to prove these results is the language of a free monad of an exact endofunctor on an abelian category. This allows us to avoid the technical combinatorics arising from quiver representations. The setup also specializes to yield more general results, in particular in the case of representations of (generalised) species. Comment: 41 pages. Comments welcome

    Optimization and Abstraction: A Synergistic Approach for Analyzing Neural Network Robustness

    In recent years, the notion of local robustness (or robustness for short) has emerged as a desirable property of deep neural networks. Intuitively, robustness means that small perturbations to an input do not cause the network to perform misclassifications. In this paper, we present a novel algorithm for verifying robustness properties of neural networks. Our method synergistically combines gradient-based optimization methods for counterexample search with abstraction-based proof search to obtain a sound and ($\delta$-)complete decision procedure. Our method also employs a data-driven approach to learn a verification policy that guides abstract interpretation during proof search. We have implemented the proposed approach in a tool called Charon and experimentally evaluated it on hundreds of benchmarks. Our experiments show that the proposed approach significantly outperforms three state-of-the-art tools, namely AI^2, Reluplex, and Reluval.
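
    The interplay of counterexample search and abstraction-based proof can be illustrated with a deliberately simple sketch. It is not Charon's algorithm: random sampling stands in for gradient-based search, plain interval arithmetic stands in for the abstract domain, there is no learned policy or refinement, and the two-layer network is a hypothetical toy.

```python
import random

def affine_interval(W, b, lo, hi):
    """Sound bounds for x -> W x + b over the box [lo, hi] (interval arithmetic)."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        out_lo.append(bias + sum(w * (lo[j] if w >= 0 else hi[j]) for j, w in enumerate(row)))
        out_hi.append(bias + sum(w * (hi[j] if w >= 0 else lo[j]) for j, w in enumerate(row)))
    return out_lo, out_hi

def forward(layers, x):
    """Concrete forward pass; ReLU after every layer except the last."""
    for k, (W, b) in enumerate(layers):
        x = [bias + sum(w * xi for w, xi in zip(row, x)) for row, bias in zip(W, b)]
        if k < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def check_robust(layers, x, eps, label, samples=1000, seed=0):
    """Return ('counterexample', point), ('verified', None) or ('unknown', None)."""
    rng = random.Random(seed)
    # 1) Counterexample search inside the L-infinity ball of radius eps.
    for _ in range(samples):
        p = [xi + rng.uniform(-eps, eps) for xi in x]
        out = forward(layers, p)
        if max(range(len(out)), key=out.__getitem__) != label:
            return ("counterexample", p)
    # 2) Proof attempt: push the whole ball through the network as a box.
    lo, hi = [xi - eps for xi in x], [xi + eps for xi in x]
    for k, (W, b) in enumerate(layers):
        lo, hi = affine_interval(W, b, lo, hi)
        if k < len(layers) - 1:
            lo, hi = [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]
    if all(lo[label] > hi[j] for j in range(len(lo)) if j != label):
        return ("verified", None)
    # The box was too coarse to decide; a real tool would refine it.
    return ("unknown", None)

# Hypothetical toy network: identity hidden layer, then scores (h0 - h1, h1 - h0).
layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
          ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])]
for eps in (0.1, 0.75, 1.0):
    print(eps, check_robust(layers, [2.0, 0.5], eps, label=0)[0])
```

    On this toy input the small radius is proved robust, the intermediate radius is left undecided because the interval bounds are too loose, and the large radius should yield a concrete misclassified point. The undecided case is where a method like the one in the abstract would refine the abstraction instead of giving up.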

    Distributionally Robust Optimization

    This chapter presents a class of distributionally robust optimization problems in which a decision-maker has to choose an action in an uncertain environment. The decision-maker has a continuous action space and aims to learn her optimal strategy. The true distribution of the uncertainty is unknown to the decision-maker. This chapter provides alternative ways to select a distribution based on empirical observations of the decision-maker. This leads to a distributionally robust optimization problem. Simple algorithms, whose dynamics are inspired by gradient flows, are proposed to find local optima. The method is extended to a class of optimization problems with orthogonal constraints and coupled constraints over the simplex set and polytopes. The designed dynamics do not use the projection operator and are able to satisfy both upper- and lower-bound constraints. The convergence rate of the algorithm to a generalized evolutionarily stable strategy is derived using a mean regret estimate. Illustrative examples are provided.
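
    One well-known way to follow a gradient on the simplex without a projection step is a replicator-style (exponentiated-gradient) update. The sketch below uses that update as an illustration only; it is not claimed to be the chapter's dynamics, and the quadratic objective and `target` vector are hypothetical.

```python
import math

def replicator_step(x, payoff, step=0.1):
    """Exponentiated-gradient update: reweight and renormalize, no projection."""
    weights = [xi * math.exp(step * pi) for xi, pi in zip(x, payoff)]
    total = sum(weights)
    return [w / total for w in weights]

def maximize_on_simplex(grad, x0, steps=500, step=0.1):
    """Iterate the update; every iterate stays strictly inside the simplex."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, grad(x), step)
    return x

# Hypothetical concave objective f(x) = -sum_i (x_i - t_i)^2, maximized at t.
target = [0.2, 0.3, 0.5]

def grad(x):
    return [-2.0 * (xi - ti) for xi, ti in zip(x, target)]

x = maximize_on_simplex(grad, [1.0 / 3, 1.0 / 3, 1.0 / 3])
print([round(v, 4) for v in x])
```

    Because each step multiplies positive weights and renormalizes, the nonnegativity and sum-to-one constraints hold automatically at every iterate.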

    Calculating the entropy loss on adsorption of organic molecules at insulating surfaces

    Although it is recognized that the dynamic behavior of adsorbing molecules strongly affects the entropic contribution to adsorption free energy, detailed studies of the adsorption entropy of large organic molecules at insulating surfaces are still rare. We compared adsorption of two different functionalized organic molecules, 1,3,5-tri(4-cyano-4,4-biphenyl)benzene (TCB) and 1,4-bis(cyanophenyl)-2,5-bis(decyloxy)benzene (CDB), on the KCl(001) surface using density functional theory (DFT) and molecular dynamics (MD) simulations. The accuracy of the van der Waals corrected DFT-D3 was benchmarked using Møller–Plesset perturbation theory calculations. Classical force fields were then parametrized for both the TCB and CDB molecules on the KCl(001) surface. These force fields were used to perform potential of mean force (PMF) calculations of adsorption of individual molecules and extract information on the entropic contributions to adsorption energy. The results demonstrate that entropy loss upon adsorption is significant for flexible molecules. Even at relatively low temperatures (e.g., 400 K), these effects can match the enthalpic contribution to adsorption energy.
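
    For orientation, the standard thermodynamic relation behind an entropic contribution of this kind can be written as follows. The symbols are introduced here for illustration and are not taken from the paper; the sketch assumes the PMF well depth is read as the adsorption free energy and the energetic term comes from a separate estimate.

```latex
\Delta F_{\mathrm{ads}}(T) = \Delta U_{\mathrm{ads}} - T\,\Delta S_{\mathrm{ads}}
\quad\Longrightarrow\quad
-T\,\Delta S_{\mathrm{ads}} = \Delta F_{\mathrm{ads}}(T) - \Delta U_{\mathrm{ads}},
\qquad
\Delta S_{\mathrm{ads}} = -\frac{\partial\,\Delta F_{\mathrm{ads}}}{\partial T}
```

    A negative $\Delta S_{\mathrm{ads}}$ (entropy loss) makes $-T\,\Delta S_{\mathrm{ads}}$ positive, so it offsets a favorable energetic term more strongly as temperature rises.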